Detecting high-order epistasis in nonlinear
Authors
Abstract
High-order epistasis has been observed in many genotype-phenotype maps. These multi-way interactions could have profound implications for evolution and may be useful for dissecting complex traits. Previous analyses have assumed a linear genotype-phenotype map, and then applied a linear high-order epistasis model to dissect epistasis. The assumption of linearity has not been tested in most of these data sets. Using simulations, we demonstrate that neglecting nonlinearity leads to spurious high-order epistasis. We find we can account for this nonlinearity in simulated maps using a power transform. We then measure and account for nonlinearity in experimental maps for which high-order epistasis has been previously reported. When applied to seven experimental genotype-phenotype maps, we find that five of the seven exhibited nonlinearity. Correcting for this nonlinearity had a large effect on the magnitudes and signs of the estimated high-order epistatic coefficients, but only a minor effect on additive and pairwise epistatic coefficients. Even after accounting for nonlinearity, we found statistically significant fourth-order epistasis in every map studied. One map even exhibited fifth-order epistasis. The contributions of high-order epistasis to the total variation in the map ranged from 2.2% to 31.0%, with an average across maps of 12.7%. Our work describes a simple method to account for nonlinearity in binary genotype-phenotype maps. Further, it provides strong evidence for extensive high-order epistasis, even after nonlinearity is taken into account.
bioRxiv preprint first posted online Aug. 30, 2016; doi: http://dx.doi.org/10.1101/072256. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license.
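The abstract's central claim, that an unmodeled nonlinear scale produces spurious high-order epistasis which a power transform can remove, can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a five-site binary map, a strictly additive underlying trait, a nonlinearity that is a plain power function with a known exponent (in practice the exponent would have to be estimated from data), and a Walsh-style linear epistasis decomposition. All function and variable names are hypothetical.

```python
import itertools
import numpy as np


def genotypes(n_sites):
    """All binary genotypes of length n_sites as a (2**n_sites, n_sites) 0/1 array."""
    return np.array(list(itertools.product([0, 1], repeat=n_sites)))


def design_matrix(G):
    """Walsh-style basis: one column per subset of sites (product of +/-1 codes)."""
    n = G.shape[1]
    S = 1 - 2 * G  # recode 0 -> +1, 1 -> -1
    subsets = [c for k in range(n + 1)
               for c in itertools.combinations(range(n), k)]
    # The empty subset gives a column of ones (the intercept).
    X = np.column_stack([np.prod(S[:, list(c)], axis=1) for c in subsets])
    return X, subsets


def epistasis_by_order(G, y):
    """Sum of squared linear-model coefficients at each interaction order."""
    X, subsets = design_matrix(G)
    # The columns of X are mutually orthogonal, so least squares reduces to
    # a projection onto each column.
    coefs = X.T @ y / len(y)
    orders = np.array([len(c) for c in subsets])
    return np.array([np.sum(coefs[orders == k] ** 2)
                     for k in range(G.shape[1] + 1)])


rng = np.random.default_rng(0)
n_sites = 5
G = genotypes(n_sites)

# Assumed ground truth: mutations add on an underlying (positive) trait.
effects = rng.uniform(0.2, 1.0, size=n_sites)
additive = 1.0 + G @ effects

# Assumed nonlinearity: the observed phenotype is a power of the trait.
exponent = 2.5
observed = additive ** exponent

# Decompose the observed map directly, then after undoing the power scale.
raw = epistasis_by_order(G, observed)
linearized = epistasis_by_order(G, observed ** (1.0 / exponent))

for order in range(1, n_sites + 1):
    print(f"order {order}: raw={raw[order]:.3e}  linearized={linearized[order]:.3e}")
```

In this toy setting the raw decomposition reports nonzero coefficients at third order and above even though no specific interactions were simulated, while the linearized map retains only first-order terms up to floating-point error. Real data also carry measurement noise and an unknown nonlinearity, so this shows only the mechanism, not the paper's fitting procedure.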
Similar resources
Detecting High-Order Epistasis in Nonlinear Genotype-Phenotype Maps
High-order epistasis has been observed in many genotype-phenotype maps. These multi-way interactions between mutations may be useful for dissecting complex traits and could have profound implications for evolution. Alternatively, they could be a statistical artifact. High-order epistasis models assume the effects of mutations should add, when they could in fact multiply or combine in some other...
An information-gain approach to detecting three-way epistatic interactions in genetic association studies
BACKGROUND: Epistasis has been historically used to describe the phenomenon that the effect of a given gene on a phenotype can be dependent on one or more other genes, and is an essential element for understanding the association between genetic and phenotypic variations. Quantifying epistasis of orders higher than two is very challenging due to both the computational complexity of enumerating a...
Inferring the shape of global epistasis
Genotype-phenotype relationships are notoriously complicated. Idiosyncratic interactions between specific combinations of mutations occur, and are difficult to predict. Yet it is increasingly clear that many interactions can be understood in terms of global epistasis. That is, mutations may act additively on some underlying, unobserved trait, and this trait is then transformed via a nonlinear f...
High-order epistasis shapes evolutionary trajectories
High-order epistasis, where the effect of a mutation is determined by interactions with two or more other mutations, makes small, but detectable, contributions to genotype-fitness maps. While epistasis between pairs of mutations is known to be an important determinant of evolutionary trajectories, the evolutionary consequences of high-order epistasis remain poorly understood. To determine the eff...
A Cellular Automata Approach to Detecting Interactions Among Single-nucleotide Polymorphisms in Complex Multifactorial Diseases
The identification and characterization of susceptibility genes for common complex multifactorial human diseases remains a statistical and computational challenge. Parametric statistical methods such as logistic regression are limited in their ability to identify genes whose effects are dependent solely or partially on interactions with other genes and environmental exposures. We introduce cell...